A Keyword Filters Method for Spam via Maximum Independent Sets
Authors
Abstract
To evade keyword filtering, spammers insert comment characters into e-mails, such as the unusual symbols # or ※, to split keywords. This paper presents a keyword filtering method for spam based on maximum independent sets. The main contributions are: (1) a matching relation matrix algorithm that improves the performance of computing maximal independent sets; (2) a judgment criterion derived from the matching relation matrix algorithm; and (3) a behavior recognition technique that detects and rejects such e-mail on receipt. Experiments and worked examples show that the space and time complexity of the algorithm is much smaller than O(mn). Its running efficiency is also satisfactory, and it achieves complete filtering of the targeted unusual symbols during e-mail keyword filtering.
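The evasion described in the abstract, and the idea of a character-level matching relation, can be illustrated with a minimal Python sketch. This is not the paper's maximum-independent-set algorithm, which the abstract does not spell out. The function names, the `noise` symbol set, and the reading of the matrix as "keyword character i equals text character j" are all assumptions made for illustration. Building the full matrix here costs O(mn) for a keyword of length m and a text of length n, which is the bound the paper claims to improve on.

```python
def matching_matrix(keyword, text):
    """Hypothetical matching relation: M[i][j] is True when
    keyword[i] equals text[j], ignoring case."""
    return [[k.lower() == t.lower() for t in text] for k in keyword]


def contains_split_keyword(keyword, text, noise="#※*"):
    """Return True if `keyword` occurs in `text` with nothing but
    `noise` symbols inserted between its characters.

    Illustrative only: the noise set is an assumption, and this scan
    is a plain left-to-right walk over the matrix, not an
    independent-set computation.
    """
    if not keyword:
        return False
    m = matching_matrix(keyword, text)
    n = len(text)
    for start in range(n):
        if not m[0][start]:
            continue
        j = start
        matched = 1
        while matched < len(keyword):
            j += 1
            # Skip any inserted noise symbols.
            while j < n and text[j] in noise:
                j += 1
            if j >= n or not m[matched][j]:
                break
            matched += 1
        if matched == len(keyword):
            return True
    return False


if __name__ == "__main__":
    print(contains_split_keyword("free", "Get fr#ee※ gifts"))
    print(contains_split_keyword("free", "fr eeze"))
```

In this sketch a space is not treated as noise, so ordinary word boundaries do not trigger a match; widening `noise` trades fewer misses for more false positives.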
Similar resources
Learning to Filter Spam E-Mail: A Comparison of a Naive Bayesian and a Memory-Based Approach
We investigate the performance of two machine learning algorithms in the context of antispam filtering. The increasing volume of unsolicited bulk e-mail (spam) has generated a need for reliable anti-spam filters. Filters of this type have so far been based mostly on keyword patterns that are constructed by hand and perform poorly. The Naive Bayesian classifier has recently been suggested as an ...
A New Hybrid Approach of K-Nearest Neighbors Algorithm with Particle Swarm Optimization for E-Mail Spam Detection
E-mail is one of the fastest and most economical means of communication. The growing number of e-mail users has increased spam in recent years. Spam not only costs users money, time, and bandwidth, but has also become a risk to the efficiency, reliability, and security of a network. Spam developers are always trying to find ways to escape the existing filters; therefore new filters to de...
Denial of Information Attacks in Event Processing
Automated Denial of Information Attacks. It is a common assumption in event processing that the events are “clean”, i.e., they come from well-behaved and trustworthy sources. This assumption does not hold in all major open communications media for several reasons. First, adversaries may spread massive noise data, e.g., in email spam. Second, adversaries may inject potentially interesting, but o...
Accurate Spam Mail Detection
With the increasing popularity of e-mail, the e-mail spam problem is growing proportionally. Spam filtering with a near-duplicate matching scheme has been widely discussed in recent years. It is based on a known spam database formed by user feedback, which cannot fully capture the evolving nature of spam and also requires much storage. In view of the above drawbacks, we proposed an effective spam detectio...
York University at TREC 2005: SPAM Track
We propose a variant of the k-nearest neighbor classification method, called instance-weighted k-nearest neighbor method, for adaptive spam filtering. The method assigns two weights, distance weight and correctness weight, to a training instance, and makes use of the two weights when classifying a new email. The correctness weight is also used in the maintenance of the training data to make the...